ClickEnhance: Efficient 3D Interactive Segmentation with Click-Specific Encoder and Contrastive Learning


In interactive point cloud segmentation, users obtain more accurate object masks than instance segmentation provides by placing a small number of positive and/or negative clicks on the objects of interest in the scene. Existing methods often use sparse click representations, which lead the model to focus on local detail features around the click points and fail to fully exploit the guidance each click provides, reducing click effectiveness. To address this sparsity problem, we use a dense representation that reflects spatial distance relationships, the distance map, as the click channel. Building on the distance map, we introduce ClickEnhance, which is designed to maximize the guiding impact of each click. The method combines a click-specific encoder with contrastive learning. The click-specific encoder ensures that the network adequately accounts for the influence of individual clicks during feature encoding. Contrastive learning reduces the feature distance between the click points and the target object, simplifying the subsequent segmentation. Experimental results show that ClickEnhance markedly improves segmentation performance across multiple datasets and generalizes better than state-of-the-art methods on challenging datasets. It generates high-precision object-level masks with fewer interactions, indicating strong potential for practical applications.
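As a rough illustration of what a dense, distance-based click channel could look like, the sketch below assigns every point its distance to the nearest click, clipped and scaled to [0, 1]. This is not the paper's formulation: the function name `click_distance_map`, the `max_dist` truncation, the normalization, and the use of separate positive and negative channels are all assumptions made for the example.

```python
import math

def click_distance_map(points, clicks, max_dist=1.0):
    """Per-point distance to the nearest click, clipped to max_dist and scaled to [0, 1].

    Hypothetical helper for illustration only. With no clicks, every point
    receives 1.0, i.e. the "far from any click" value.
    """
    if max_dist <= 0:
        raise ValueError("max_dist must be positive")
    values = []
    for p in points:
        # Euclidean distance to the closest click (infinity if there are none).
        nearest = min((math.dist(p, c) for c in clicks), default=math.inf)
        values.append(min(nearest, max_dist) / max_dist)
    return values

# Toy scene: three points on a line, one positive click, no negative clicks.
points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
positive_clicks = [(0.0, 0.0, 0.0)]
negative_clicks = []

pos_channel = click_distance_map(points, positive_clicks, max_dist=1.0)
neg_channel = click_distance_map(points, negative_clicks, max_dist=1.0)
print(pos_channel)  # [0.0, 0.5, 1.0]
print(neg_channel)  # [1.0, 1.0, 1.0]
```

Unlike a sparse encoding that marks only the clicked points, every point here carries a value that depends on how far it lies from the click, which is the property the abstract attributes to the distance map.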

Yueyang Wen, Yiwen Hou, Shuheng Zhang, Feng Wu. ClickEnhance: Efficient 3D Interactive Segmentation with Click-Specific Encoder and Contrastive Learning. IEEE Transactions on Multimedia (IEEE TMM), :1-12, January 2026.
@article{WHZWtmm26,
 author = {Yueyang Wen and Yiwen Hou and Shuheng Zhang and Feng Wu},
 doi = {10.1109/TMM.2026.3651127},
 journal = {IEEE Transactions on Multimedia (IEEE TMM)},
 month = {January},
 pages = {1-12},
 title = {ClickEnhance: Efficient 3D Interactive Segmentation with Click-Specific Encoder and Contrastive Learning},
 year = {2026}
}